AITopics | state space size

Planning methods struggle with computational intractability when solving task-level problems in large-scale environments. This work explores leveraging the commonsense knowledge encoded in LLMs to help planning techniques deal with these complex scenarios. We achieve this by efficiently using LLMs to prune irrelevant components from the planning problem's state space, substantially reducing its complexity. We demonstrate the efficacy of this system through extensive experiments within a household simulation environment, alongside real-world validation using a 7-DoF manipulator (video https://youtu.be/6ro2UOtOQS4).
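To make the pruning idea concrete, here is a minimal sketch of how removing irrelevant objects shrinks a planning state space. It is an illustration under stated assumptions, not the paper's system: the scene, the per-object state counts, and `toy_relevance` (a hand-written lookup standing in for an LLM relevance query) are all hypothetical.

```python
import math
from typing import Callable, Dict

# Hypothetical stand-in for an LLM relevance query: a hand-written table
# mapping a task to the objects judged relevant for it.
RELEVANT = {"make coffee": {"mug", "coffee_machine", "kitchen_counter"}}


def toy_relevance(obj: str, task: str) -> bool:
    """Return True if the (hypothetical) oracle deems obj relevant to task."""
    return obj in RELEVANT.get(task, set())


def state_space_size(objects: Dict[str, int]) -> int:
    """Product of per-object state counts (1 for an empty scene)."""
    return math.prod(objects.values())


def prune(objects: Dict[str, int], task: str,
          is_relevant: Callable[[str, str], bool]) -> Dict[str, int]:
    """Keep only the objects that the relevance oracle accepts for this task."""
    return {o: n for o, n in objects.items() if is_relevant(o, task)}


# Hypothetical scene: object name -> number of states it can be in.
scene = {"mug": 4, "coffee_machine": 2, "kitchen_counter": 1,
         "tv": 2, "bookshelf": 3, "lamp": 2}
pruned = prune(scene, "make coffee", toy_relevance)
print(state_space_size(scene), state_space_size(pruned))
```

In this toy scene the planner would face 96 joint states before pruning and 8 afterwards. In practice the relevance oracle would be replaced by an actual model query, and its mistakes (dropping an object the task needs) would have to be handled.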

graph, information, llm, (15 more...)

arXiv.org Artificial Intelligence

2409.04775

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Europe > Germany (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Pre-training with Synthetic Data Helps Offline Reinforcement Learning

Wang, Zecheng, Wang, Che, Dong, Zixuan, Ross, Keith

arXiv.org Artificial Intelligence, Oct-5-2023

Recently, it has been shown that for offline deep reinforcement learning (DRL), pre-training Decision Transformer with a large language corpus can improve downstream performance (Reid et al., 2022). A natural question to ask is whether this performance gain can only be achieved with language pre-training, or can be achieved with simpler pre-training schemes which do not involve language. In this paper, we first show that language is not essential for improved performance, and indeed pre-training with synthetic IID data for a small number of updates can match the performance gains from pre-training with a large language corpus; moreover, pre-training with data generated by a one-step Markov chain can further improve the performance. Inspired by these experimental results, we then consider pre-training Conservative Q-Learning (CQL), a popular offline DRL algorithm, which is Q-learning-based and typically employs a Multi-Layer Perceptron (MLP) backbone. Surprisingly, pre-training with simple synthetic data for a small number of updates can also improve CQL, providing consistent performance improvement on D4RL Gym locomotion datasets. The results of this paper not only illustrate the importance of pre-training for offline DRL but also show that the pre-training data can be synthetic and generated with remarkably simple mechanisms.

It is well known that pre-training can provide significant boosts in performance and robustness for downstream tasks, both for Natural Language Processing (NLP) and Computer Vision (CV). Recently, in the field of Deep Reinforcement Learning (DRL), research on pre-training is also becoming increasingly popular. An important step in the direction of pre-training DRL models is the recent paper by Reid et al. (2022), which showed that for Decision Transformer (Chen et al., 2021), pre-training with the Wikipedia corpus can significantly improve the performance of the downstream offline RL task. Reid et al. (2022) further showed that pre-training on predicting pixel sequences can hurt performance. The authors state that their results indicate "a foreseeable future where everyone should use a pre-trained language model for offline RL".
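As a rough illustration of the two synthetic data sources the abstract mentions, the sketch below samples token sequences either IID or from a one-step Markov chain. The vocabulary size, sequence length, and the way the transition matrix is drawn (normalized uniform random weights) are assumptions made for this example and are not taken from the paper.

```python
import random
from typing import List


def make_transition_matrix(n_states: int, rng: random.Random) -> List[List[float]]:
    """Random row-stochastic matrix: row i holds P(next state | current = i)."""
    rows = []
    for _ in range(n_states):
        weights = [rng.random() for _ in range(n_states)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def sample_markov(transition: List[List[float]], length: int,
                  rng: random.Random) -> List[int]:
    """Sample a sequence where each token depends only on the previous one."""
    n = len(transition)
    state = rng.randrange(n)  # uniform initial state (an assumption)
    seq = [state]
    for _ in range(length - 1):
        state = rng.choices(range(n), weights=transition[state])[0]
        seq.append(state)
    return seq


def sample_iid(n_states: int, length: int, rng: random.Random) -> List[int]:
    """Sample every token independently and uniformly."""
    return [rng.randrange(n_states) for _ in range(length)]


rng = random.Random(0)
P = make_transition_matrix(n_states=10, rng=rng)
markov_seq = sample_markov(P, length=32, rng=rng)
iid_seq = sample_iid(n_states=10, length=32, rng=rng)
```

Sequences like these could then be fed to a next-token prediction objective for a small number of updates before offline RL training; that training loop is outside the scope of this sketch.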

arxiv preprint arxiv, dataset, synthetic data, (12 more...)

arXiv.org Artificial Intelligence

2310.00771

Country:

North America > United States > New York (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.64)

Pushing the Power of Stochastic Greedy Ordering Schemes for Inference in Graphical Models

Kask, Kalev (University of California, Irvine) | Gelfand, Andrew (University of California, Irvine) | Otten, Lars (University of California, Irvine) | Dechter, Rina (University of California, Irvine)

AAAI Conferences, Aug-4-2011

We study iterative randomized greedy algorithms for generating (elimination) orderings with small induced width and state space size, two parameters known to bound the complexity of inference in graphical models. We propose and implement the Iterative Greedy Variable Ordering (IGVO) algorithm, a new variant within this algorithm class. An empirical evaluation using different ranking functions and conditions of randomness demonstrates that IGVO finds significantly better orderings than standard greedy ordering implementations when evaluated within an anytime framework. Additional order-of-magnitude improvements are demonstrated on a multi-core system, thus further expanding the set of solvable graphical models. The experiments also confirm the superiority of the MinFill heuristic within the iterative scheme.
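The general idea of an iterative randomized greedy ordering can be sketched as follows. This is a generic textbook-style min-fill procedure with random tie-breaking and restarts, not the IGVO algorithm itself, whose details are not given in the abstract. The function names, the restart count, and the choice to compare candidates by induced width first and state space size second are assumptions of this example.

```python
import math
import random
from itertools import combinations


def fill_count(adj, v):
    """Number of edges needed to turn v's current neighbors into a clique."""
    return sum(1 for a, b in combinations(adj[v], 2) if b not in adj[a])


def greedy_order(graph, domain, rng):
    """One min-fill elimination pass; ties are broken uniformly at random.

    Returns (order, induced_width, state_space_size), where the state space
    size is the sum over eliminated variables of the product of domain sizes
    in the cluster formed by the variable and its neighbors.
    """
    adj = {v: set(ns) for v, ns in graph.items()}  # working copy
    order, width, space = [], 0, 0
    while adj:
        scores = {v: fill_count(adj, v) for v in adj}
        best = min(scores.values())
        v = rng.choice(sorted(u for u, s in scores.items() if s == best))
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        space += math.prod(domain[u] for u in nbrs | {v})
        for a, b in combinations(nbrs, 2):  # connect the neighbors
            adj[a].add(b)
            adj[b].add(a)
        for u in nbrs:  # remove v from the graph
            adj[u].discard(v)
        order.append(v)
    return order, width, space


def iterative_greedy(graph, domain, iterations=50, seed=0):
    """Repeat the randomized pass and keep the best ordering found so far."""
    rng = random.Random(seed)
    best = None
    for _ in range(iterations):
        cand = greedy_order(graph, domain, rng)
        if best is None or (cand[1], cand[2]) < (best[1], best[2]):
            best = cand
    return best


# A 4-cycle over binary variables: 0 - 1 - 2 - 3 - 0.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
binary = {v: 2 for v in cycle}
order, width, space = iterative_greedy(cycle, binary)
```

Because the loop keeps the best result seen so far, it can be stopped at any time, which is the sense in which such schemes fit an anytime framework. Independent restarts also parallelize naturally across cores.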

algorithm, artificial intelligence, igvo, (15 more...)

AAAI Conferences

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country: North America > United States > California > Orange County > Irvine (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Leveraging LLMs, Graphs and Object Hierarchies for Task Planning in Large-Scale Environments
